Using Python, produce a listing of the files in the subdirectory data of geogg122/Chapter3_Scientific_Numerical_Python that end with .nc and put this listing in a file called files/data/data.dat with each entry on a different line
Hopefully, you should already be in the directory geogg122/Chapter3_Scientific_Numerical_Python, if not, you may like to go there before starting this exercise.
If you were to do this from unix, you would getr the listing with:
!ls -l data/*.nc
!ls -l data/*.nc
or similar.
To do this in Python, you should use glob, and the same 'pattern':
import glob
import glob
files = glob.glob('data/*.nc')
print files
This is a list. We want to write this to a file called files/data/data.dat.
First we open it, then simply use writelines to write the list of strings, the close the file.
filename = 'files/data/data.dat'
filename = 'files/data/data.dat'
# open in write modefp = open(filename,'w')
fp.writelines(files)
fp.close()
Hopefully, that all worked well, but just to check from unix:
!cat files/data/data.dat
!cat files/data/data.dat
which isn't quite what we wanted: we need to insert a newline character at the end of each string before writing.
There are several ways to do this, e.g.:
files = glob.glob('data/*.nc')
files = glob.glob('data/*.nc')
for i,file in enumerate(files):
files[i] = file + '\n'
print files
files = glob.glob('data/*.nc')
files = glob.glob('data/*.nc')
# or:files = [file + '\n' for file in files]
print files
# or all at once if you like:
# or all at once if you like:files = [file + '\n' for file in glob.glob('data/*.nc')]
print files
or several other ways ...
Putting this together:
import glob
import glob
files = [file + '\n' for file in glob.glob('data/*.nc')]
filename = 'files/data/data.dat'
# open in write modefp = open(filename,'w')
fp.writelines(files)
fp.close()
then checking:
!cat files/data/data.dat
!cat files/data/data.dat
which is what we wanted.
You can sort of make movies in pylab,
but you generally have to make a system call to unix at some point, so
it's probably easier to do this all in unix with the utility convert.
At the unix prompt, chack that you have access to convert:
berlin% which convert
/usr/bin/convert
If this doesn't come up with anything useful, there is probably a version in /usr/bin/convert or /usr/local/bin/convert (If you don't have it on your local machine, install ImageMagick which contains the command line tool convert).
To use this, e.g.:
from the unix command line:
berlin% cd ~/Data/geogg122/Chapter3_Scientific_Numerical_Python
berlin% convert files/data/albedo.jpg files/data/albedo.gif
or from within a notebook:
!convert files/data/albedo.jpg files/data/albedo.gif
!convert files/data/albedo.jpg files/data/albedo.gif
Or, more practically here, you can run a unix command directly from Python:
import os
import os
cmd = 'convert files/data/albedo.jpg files/data/albedo.gif'
os.system(cmd)
This will convert the file files/data/albedo.jpg (in jpeg format) to files/data/albedo.gif (in gif format).

We can also use convert to make animated gifs, which is one way of making a movie.
You
have all of the code you need above to be able to read a GlobAlbedo
file for a given month and waveband in Python and save a picture in jpeg
format, but to recap for BHR_VIS:
from netCDF4 import Dataset
from netCDF4 import Dataset
import pylab as plt
import os
root = 'files/data/'
month_list = ['January','February','March','April','May','June',\
'July','August','September','October','November','December']
# make a dictionary from 2 listsmonth_dict = dict(zip(range(1,13),month_list))
# example filename : use formatting string:# %d%02dyear = 2009
# set the monthmonth = 1
''' Read the data '''local_file = root + 'GlobAlbedo.%d%02d.mosaic.5.nc'%(year,month)
# load the netCDF data from the file f.filenamenc = Dataset(local_file,'r')
band = nc.variables['DHR_VIS']
''' Plot the data and save as picture jpeg format '''# make a string with the output file nameout_file = root + 'GlobAlbedo.%d%02d.jpg'%(year,month)
# plotplt.figure(figsize=(16, 8))
plt.clf()
# %9s forces the string to be 9 characters longplt.title('VIS BHR albedo for %9s %d'%(month_dict[month],year))
# use nearest neighbour interpolationplt.imshow(band,interpolation='nearest',cmap=plt.get_cmap('Spectral'),vmin=0.0,vmax=1.0)
# show a colour bar plt.colorbar()
plt.savefig(out_file)
''' Convert the file to gif '''# set up the unix command which is of the form # convert input output# Here input will be out_file# and output we can get with out_file.replace('.jpg','.gif')# i.e. replacing where it says .jpg with .gifcmd = 'convert %s %s'%(out_file,out_file.replace('.jpg','.gif'))
os.system(cmd)
Modify the code above to loop over each month, so that it generates a set of gif format files for the TOTAL SHORTWAVE ALBEDO
You should confirm that these exist, and that the file modification time is when you ran it (not when I generated the files for these notes, which is Oct 10 2013).
ls -l files/data/GlobAlbedo*gif
ls -l files/data/GlobAlbedo*gif
Really all you need to do here is to make month appear in a loop, e.g. using:
for month in range(1,13):
and then make sure that all of the code below is in that loop (i.e. indented) as below.
One additional thing is to make sure you select the waveband you were supposed to (shortwave albedo)
band = nc.variables['DHR_SW']
and finally, make sure you change the title:
You should also however, go through the code above line by line, making sure you appreciate what is going on at each stage and why we have done these things (in this oroder).
from netCDF4 import Dataset
from netCDF4 import Dataset
import pylab as plt
import os
root = 'files/data/'
month_list = ['January','February','March','April','May','June',\
'July','August','September','October','November','December']
# make a dictionary from 2 listsmonth_dict = dict(zip(range(1,13),month_list))
# example filename : use formatting string:# %d%02dyear = 2009
# set the monthfor month in range(1,13):
''' Read the data '''local_file = root + 'GlobAlbedo.%d%02d.mosaic.5.nc'%(year,month)
# load the netCDF data from the file f.filenamenc = Dataset(local_file,'r')
# select which band we wantband = nc.variables['DHR_SW']
''' Plot the data and save as picture jpeg format ''' # make a string with the output file nameout_file = root + 'GlobAlbedo.%d%02d.jpg'%(year,month)
# plotplt.figure(figsize=(16, 8))
plt.clf()
# %9s forces the string to be 9 characters longplt.title('SW BHR albedo for %9s %d'%(month_dict[month],year))
# use nearest neighbour interpolationplt.imshow(band,interpolation='nearest',cmap=plt.get_cmap('Spectral'),vmin=0.0,vmax=1.0)
# show a colour bar plt.colorbar()
plt.savefig(out_file)
''' Convert the file to gif ''' # set up the unix command which is of the form # convert input output # Here input will be out_file # and output we can get with out_file.replace('.jpg','.gif') # i.e. replacing where it says .jpg with .gifcmd = 'convert %s %s'%(out_file,out_file.replace('.jpg','.gif'))
os.system(cmd)
The unix command to convert these files to an animated gif is:
convert -delay 100 -loop 0 files/data/GlobAlbedo.2009??.gif files/data/GlobAlbedo.2009.SW.gif
Run this (ideally, from within Python) to create the animated gif GlobAlbedo.2009.SW.gif
Again, confirm that you created this file (and it is not just a version you downloaded):
ls -l files/data/GlobAlbedo.2009.SW.gif
ls -l files/data/GlobAlbedo.2009.SW.gif
You could just type the command at the unix prompt ... but to do it using a system call from within Python, e.g.:
import os
import os
# this is quite a neat way of generating the string for the input filesout_file = 'files/data/GlobAlbedo.%d.SW.gif'%year
in_files = out_file.replace('.SW.gif','??.gif')
cmd = 'convert -delay 100 -loop 0 %s %s'%(in_files,out_file)
# check the cmd is okprint cmd
# good ... so now run it
# good ... so now run itos.system(cmd)

To view the animated gif you have generated, open it in a browser.
from netCDF4 import Dataset
from netCDF4 import Dataset
import numpy as np
root = 'files/data/'
year = 2009
# which months?months = xrange(1,13)
# empty listdata = []# loop over month# use enumerate so we have an index counterfor i,month in enumerate(months):
# this then is the file we wantlocal_file = root + 'GlobAlbedo.%d%02d.mosaic.5.nc'%(year,month)
# load the netCDF data from the file local_filenc = Dataset(local_file,'r')
# append what we read to the list called datadata.append(np.array(nc.variables['DHR_SW']))
# convert data to a numpy array (its a list of arrays at the moment)data = np.array(data)
N.B. Do this exercise before proceeding to the next section.
Taking the code above as a starting point, generate a masked array of the GlobAlbedo dataset for the year 2009.
To recap, to make a masked array, we use some code such as:
import numpy.ma as ma
import numpy.ma as ma
band = np.array(nc.variables['DHR_SW'])
masked_band = ma.array(band,mask=np.isnan(band))
print masked_band.mask
So we can do this for each band as we loop over each month:
from netCDF4 import Dataset
from netCDF4 import Dataset
import numpy as np
import numpy.ma as ma
root = 'files/data/'
year = 2009
# which months?months = xrange(1,13)
# empty listdata = []# loop over month# use enumerate so we have an index counterfor i,month in enumerate(months):
# this then is the file we wantlocal_file = root + 'GlobAlbedo.%d%02d.mosaic.5.nc'%(year,month)
# load the netCDF data from the file local_filenc = Dataset(local_file,'r')
# load into the variable 'band'band = np.array(nc.variables['DHR_SW'])
# convert to a masked arraymasked_band = ma.array(band,mask=np.isnan(band))
# append what we read to the list called datadata.append(masked_band)
# convert data to a numpy array (its a list of arrays at the moment)data = np.array(data)
That's a good start, but we used:
data = np.array(data)
at the end, which means that we have a 3D numpy array ... rather than a masked array:
print type(data)
print type(data)
print data.shape
print data.ndim
so we need to replace this with a function to make it into a masked array:
from netCDF4 import Dataset
from netCDF4 import Dataset
import numpy as np
import numpy.ma as ma
root = 'files/data/'
year = 2009
# which months?months = xrange(1,13)
# empty listdata = []# loop over month# use enumerate so we have an index counterfor i,month in enumerate(months):
# this then is the file we wantlocal_file = root + 'GlobAlbedo.%d%02d.mosaic.5.nc'%(year,month)
# load the netCDF data from the file local_filenc = Dataset(local_file,'r')
# load into the variable 'band'band = np.array(nc.variables['DHR_SW'])
# convert to a masked arraymasked_band = ma.array(band,mask=np.isnan(band))
# append what we read to the list called datadata.append(masked_band)
# convert data to a numpy array (its a list of arrays at the moment)data = ma.array(data)
print type(data)
print type(data)
print data.shape
print data.ndim
we might notice that now, the data mask (data.mask) is a 3D mask:
print data.mask.shape
print data.mask.shape
print data.mask.ndim
and we might just check that the mask is the same for all months:
First, lets sum the masks over axis 0 (i.e. over all months):
sum = (data.mask).sum(axis=0)
sum = (data.mask).sum(axis=0)
print sum.shape
print sum
print sum
This should be 12 (where mask is True) or 0 (where mask is False). How can we chack that quickly?
Fortunately, there is a convenient numpy function np.unique that will give you the unique values in an array:
np.unique(sum)
np.unique(sum)
The only values in here are 12 and 0, so the masks must be consistent!